AITopics | accuracy and stability

accuracy and stability

Theta-regularized Kriging: Modelling and Algorithms

Xie, Xuelin, Lu, Xiliang

arXiv.org Machine Learning, Apr-17-2026

To obtain more accurate model parameters and improve prediction accuracy, we proposed a regularized Kriging model that penalizes the hyperparameter theta in the Gaussian stochastic process, termed the Theta-regularized Kriging. We derived the optimization problem for this model from a maximum likelihood perspective. Additionally, we presented specific implementation details for the iterative process, including the regularized optimization algorithm and the geometric search cross-validation tuning algorithm. Three distinct penalty methods, Lasso, Ridge, and Elastic-net regularization, were meticulously considered. Meanwhile, the proposed Theta-regularized Kriging models were tested on nine common numerical functions and two practical engineering examples. The results demonstrate that, compared with other penalized Kriging models, the proposed model performs better in terms of accuracy and stability.

artificial intelligence, machine learning, theta-regularized kriging, (18 more...)

doi: 10.1016/j.apm.2024.07.034

2604.14975

Country:

Europe > United Kingdom (0.04)
Asia > China > Hubei Province > Wuhan (0.04)
Asia > Middle East > Iran (0.04)
Africa > Middle East > Egypt (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

The Diversified Ensemble Neural Network

Neural Information Processing Systems, Dec-24-2025, 12:07:20 GMT

Ensembling is a general way of improving the accuracy and stability of learning models, especially their generalization ability on small datasets. Compared with tree-based methods, relatively little work has been devoted to an in-depth study of effective ensemble design for neural networks. In this paper, we propose a principled ensemble technique that constructs a so-called diversified ensemble layer to combine multiple networks as individual modules. We theoretically show that each individual model in our ensemble layer corresponds to weights in the ensemble layer optimized in different directions. Meanwhile, the devised ensemble layer can be readily integrated into popular neural architectures, including CNNs, RNNs, and GCNs. Extensive experiments are conducted on public tabular datasets, images, and texts. With a weight-sharing approach, our method notably improves the accuracy and stability of the original neural networks with negligible extra time and space overhead.

diversified ensemble neural network, ensemble layer, name change, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

MOSS: Multi-Objective Optimization for Stable Rule Sets

Liu, Brian, Mazumder, Rahul

arXiv.org Machine Learning, Jul-31-2025

We present MOSS, a multi-objective optimization framework for constructing stable sets of decision rules. MOSS incorporates three important criteria for interpretability: sparsity, accuracy, and stability, into a single multi-objective optimization framework. Importantly, MOSS allows a practitioner to rapidly evaluate the trade-off between accuracy and stability in sparse rule sets in order to select an appropriate model. We develop a specialized cutting plane algorithm in our framework to rapidly compute the Pareto frontier between these two objectives, and our algorithm scales to problem instances beyond the capabilities of commercial optimization solvers. Our experiments show that MOSS outperforms state-of-the-art rule ensembles in terms of both predictive performance and stability.
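To illustrate what a Pareto frontier between accuracy and stability means, here is a minimal sketch in Python. It is a brute-force filter over a handful of made-up (accuracy, stability) pairs, not the cutting plane algorithm described in the abstract; the function name `pareto_front` and all candidate values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (accuracy, stability), higher is better for both


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (brute force, O(n^2))."""
    front = []
    for i, (acc_i, stab_i) in enumerate(points):
        dominated = any(
            acc_j >= acc_i and stab_j >= stab_i and (acc_j > acc_i or stab_j > stab_i)
            for j, (acc_j, stab_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            front.append((acc_i, stab_i))
    return sorted(front)


# Hypothetical candidate rule sets, each scored on accuracy and stability.
candidates = [(0.90, 0.60), (0.85, 0.80), (0.80, 0.75), (0.88, 0.70), (0.70, 0.90)]
front = pareto_front(candidates)
print(front)
```

In this toy data, (0.80, 0.75) is dropped because (0.85, 0.80) is better on both criteria; the remaining points are the trade-off a practitioner would choose among.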

artificial intelligence, optimization problem, stability, (15 more...)

2506.0803

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Pacific Ocean > South Pacific Ocean (0.04)
Oceania > Samoa (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.92)
Education (0.92)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Secure and Private Federated Learning: Achieving Adversarial Resilience through Robust Aggregation

Yang, Kun, Imam, Neena

arXiv.org Artificial Intelligence, Jun-5-2025

Federated Learning (FL) enables collaborative machine learning across decentralized data sources without sharing raw data. It offers a promising approach to privacy-preserving AI. However, FL remains vulnerable to adversarial threats from malicious participants, referred to as Byzantine clients, who can send misleading updates to corrupt the global model. Traditional aggregation methods, such as simple averaging, are not robust to such attacks. More resilient approaches, like the Krum algorithm, require prior knowledge of the number of malicious clients, which is often unavailable in real-world scenarios. To address these limitations, we propose Average-rKrum (ArKrum), a novel aggregation strategy designed to enhance both the resilience and privacy guarantees of FL systems. Building on our previous work (rKrum), ArKrum introduces two key innovations. First, it includes a median-based filtering mechanism that removes extreme outliers before estimating the number of adversarial clients. Second, it applies a multi-update averaging scheme to improve stability and performance, particularly when client data distributions are not identical. We evaluate ArKrum on benchmark image and text datasets under three widely studied Byzantine attack types. Results show that ArKrum consistently achieves high accuracy and stability. It performs as well as or better than other robust aggregation methods. These findings demonstrate that ArKrum is an effective and practical solution for secure FL systems in adversarial environments.
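For context, the following is a minimal sketch of the classic Krum selection rule that the abstract contrasts ArKrum with. It is not ArKrum or rKrum; it only shows why Krum needs the number of Byzantine clients as an input. The function name `krum`, the parameter `num_byzantine`, and the toy update vectors are assumptions made for illustration.

```python
import numpy as np


def krum(updates: np.ndarray, num_byzantine: int) -> int:
    """Return the index of the client update selected by classic Krum.

    updates: array of shape (n_clients, n_params).
    num_byzantine: assumed number of malicious clients (must be known in advance).
    """
    n = updates.shape[0]
    k = n - num_byzantine - 2  # number of nearest neighbours summed in each score
    if k < 1:
        raise ValueError("Krum needs at least num_byzantine + 3 clients")
    # Pairwise squared Euclidean distances between all client updates.
    diffs = updates[:, None, :] - updates[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    scores = np.empty(n)
    for i in range(n):
        others = np.delete(sq_dists[i], i)  # drop the distance to itself
        scores[i] = np.sort(others)[:k].sum()
    return int(np.argmin(scores))


# Toy example: four similar honest updates and one extreme malicious update.
honest = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [1.0, 1.05]])
malicious = np.array([[10.0, -10.0]])
updates = np.vstack([honest, malicious])
selected = krum(updates, num_byzantine=1)
print(selected)
```

The selected index always points at one of the tightly clustered updates here, because the outlier's distances to every other client are large. Simple averaging of the same five vectors would be pulled far toward the malicious update.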

algorithm, artificial intelligence, machine learning, (19 more...)

2505.17226

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Evaluating and Advancing Multimodal Large Language Models in Ability Lens

Chen, Feng, Gou, Chenhui, Liu, Jing, Yang, Yang, Li, Zhaoyang, Zhang, Jiyuan, Sun, Zhenbang, Zhuang, Bohan, Wu, Qi

arXiv.org Artificial Intelligence, Nov-21-2024

As multimodal large language models (MLLMs) advance rapidly, rigorous evaluation has become essential, providing further guidance for their development. In this work, we focus on a unified and robust evaluation of vision perception abilities, the foundational skill of MLLMs. We find that existing perception benchmarks, each focusing on different question types, domains, and evaluation metrics, introduce significant evaluation variance, complicating comprehensive assessments of perception abilities when relying on any single benchmark. To address this, we introduce AbilityLens, a unified benchmark designed to evaluate MLLMs across six key perception abilities, focusing on both accuracy and stability, with each ability encompassing diverse question types, domains, and metrics. With the assistance of AbilityLens, we: (1) identify the strengths and weaknesses of current models, highlighting stability patterns and revealing a notable performance gap between open-source and closed-source models; (2) introduce an online evaluation mode, which uncovers interesting ability conflict and early convergence phenomena during MLLM training; and (3) design a simple ability-specific model merging method that combines the best ability checkpoint from early training stages, effectively mitigating performance decline due to ability conflict. The benchmark and online leaderboard will be released soon.

abilitylen, arxiv preprint arxiv, benchmark, (14 more...)

2411.14725

Country:

Oceania > Australia (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

The Diversified Ensemble Neural Network

Neural Information Processing Systems, Oct-11-2024, 05:01:19 GMT

accuracy and stability, diversified ensemble neural network, ensemble layer, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Weather Prediction Using CNN-LSTM for Time Series Analysis: A Case Study on Delhi Temperature Data

Li, Bangyu, Qian, Yang

arXiv.org Artificial Intelligence, Sep-14-2024

As global climate change intensifies, accurate weather forecasting is increasingly crucial for sectors such as agriculture, energy management, and environmental protection. Traditional methods, which rely on physical and statistical models, often struggle with complex, nonlinear, and time-varying data, underscoring the need for more advanced techniques. This study explores a hybrid CNN-LSTM model to enhance temperature forecasting accuracy for the Delhi region, using historical meteorological data from 1996 to 2017. We employed both direct and indirect methods, including comprehensive data preprocessing and exploratory analysis, to construct and train our model. The CNN component effectively extracts spatial features, while the LSTM captures temporal dependencies, leading to improved prediction accuracy. Experimental results indicate that the CNN-LSTM model significantly outperforms traditional forecasting methods in terms of both accuracy and stability, with a mean square error (MSE) of 3.26217 and a root mean square error (RMSE) of 1.80615. The hybrid model demonstrates its potential as a robust tool for temperature prediction, offering valuable insights for meteorological forecasting and related fields. Future research should focus on optimizing model architecture, exploring additional feature extraction techniques, and addressing challenges such as overfitting and computational complexity. This approach not only advances temperature forecasting but also provides a foundation for applying deep learning to other time series forecasting tasks.
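The two error figures quoted in the abstract can be checked against each other, since RMSE is by definition the square root of MSE. The short Python sketch below does only that arithmetic; the variable names are illustrative and no model or data from the paper is reproduced.

```python
import math

# Values as reported in the abstract.
reported_mse = 3.26217
reported_rmse = 1.80615

# RMSE is the square root of MSE, so the two reported figures should agree.
rmse_from_mse = math.sqrt(reported_mse)
consistent = math.isclose(rmse_from_mse, reported_rmse, abs_tol=1e-5)

print(f"sqrt(MSE) = {rmse_from_mse:.5f}, consistent with reported RMSE: {consistent}")
```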

forecasting, prediction, prediction accuracy, (13 more...)

2409.09414

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.29)
Asia > India > NCT > Delhi (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Streamlining Redundant Layers to Compress Large Language Models

Chen, Xiaodong, Hu, Yuxuan, Zhang, Jing, Wang, Yanling, Li, Cuiping, Chen, Hong

arXiv.org Artificial Intelligence, May-22-2024

This paper introduces LLM-Streamline, a novel layer pruning approach for large language models. It is based on the observation that different layers have varying impacts on hidden states, enabling the identification of less important layers. LLM-Streamline comprises two parts: layer pruning, which removes consecutive layers with the lowest importance based on target sparsity, and layer replacement, where a lightweight network is trained to replace the pruned layers to mitigate performance loss. Additionally, a new metric called "stability" is proposed to address the limitations of accuracy in evaluating model compression. Experiments show that LLM-Streamline surpasses previous state-of-the-art pruning methods in both accuracy and stability.

arxiv preprint arxiv, benchmark, lightweight network, (11 more...)

2403.19135

Country: Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

ASI: Accuracy-Stability Index for Evaluating Deep Learning Models

Dai, Wei, Berleant, Daniel

arXiv.org Artificial Intelligence, Nov-26-2023

In the context of deep learning research, where model introductions continually occur, the need for effective and efficient evaluation remains paramount. Existing methods often emphasize accuracy metrics, overlooking stability. To address this, the paper introduces the Accuracy-Stability Index (ASI), a quantitative measure incorporating both accuracy and stability for assessing deep learning models. Experimental results demonstrate the application of ASI, and a 3D surface model is presented for visualizing ASI, mean accuracy, and coefficient of variation. This paper addresses the important issue of quantitative benchmarking metrics for deep learning models, providing a new approach for accurately evaluating accuracy and stability of deep learning models. The paper concludes with discussions on potential weaknesses and outlines future research directions.
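The abstract names mean accuracy and coefficient of variation as the two quantities visualized alongside ASI. As a rough sketch, both can be computed from repeated training runs as shown below. The ASI formula itself is not given in the abstract and is not reproduced here; the function name `mean_and_cv` and the run accuracies are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def mean_and_cv(accuracies: Sequence[float]) -> Tuple[float, float]:
    """Return (mean accuracy, coefficient of variation) over repeated runs.

    The coefficient of variation is the sample standard deviation divided by
    the mean; a lower value indicates more stable results across runs.
    """
    mean = statistics.mean(accuracies)
    std = statistics.stdev(accuracies)  # sample standard deviation (n - 1)
    return mean, std / mean


# Hypothetical test accuracies from five runs of the same model.
runs = [0.91, 0.89, 0.90, 0.92, 0.88]
mean_acc, cv = mean_and_cv(runs)
print(f"mean accuracy = {mean_acc:.3f}, coefficient of variation = {cv:.4f}")
```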

accuracy, deep learning model, sp0, (16 more...)

2311.15332

Country:

North America > United States > Indiana > Lake County > Hammond (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > United States > Arkansas > Pulaski County > Little Rock (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

To be or not to be stable, that is the question: understanding neural networks for inverse problems

Evangelista, Davide, Nagy, James, Morotti, Elena, Piccolomini, Elena Loli

arXiv.org Artificial Intelligence, Jan-25-2023

Solving linear inverse problems, which arise for example in signal and image processing, is challenging because ill-conditioning amplifies the noise in the data. Recently introduced algorithms based on deep learning outperform the more traditional model-based approaches, but they typically suffer from instability with respect to data perturbation. In this paper, we theoretically analyze the trade-off between neural network stability and accuracy in the solution of linear inverse problems. Moreover, we propose different supervised and unsupervised solutions that increase network stability while maintaining good accuracy by inheriting, in the network training, regularization from a model-based iterative scheme. Extensive numerical experiments on image deblurring confirm the theoretical results and the effectiveness of the proposed deep learning-based solutions in stably solving noisy inverse problems.

artificial intelligence, machine learning, reconstructor, (19 more...)

2211.13692

Country: Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
